SGML Documents: Where Does Quality Go?
Authors
Abstract
Quality control in electronic publications should be a major concern of anyone managing a project. Big projects, like digital libraries, gather information from many different sources: libraries, museums, universities, and other scientific or cultural organizations. Collecting and treating information from several different sources raises interesting problems, one of which is quality assurance. Quality in electronic publications shows up in several forms, from the visual aspects of the interface, to linguistic and literary aspects, to the correctness of data. SGML solves part of the problem: structural (syntactic) correctness. SGML provides a way to specify the structure of documents while keeping structure (syntax) completely separate from typesetting. Today there are many editors and environments that help the user produce well-formed and valid SGML documents by validating their structure. However, current software still gives the user too much freedom. The user has full control of the data being entered, which leaves a margin for errors. In some situations, pre-conditions on the information being entered should be enforced to prevent the user from introducing erroneous data; we shall call this process data semantics validation. The idea is to constrain the values of some structural elements of a document according to its final purpose. This way the user (who writes the documents according to that DTD) will not have full control of his data; he will be forced to obey certain domain range limitations or certain information relationships. SGML does not have the necessary constructs to implement this extra validation task. In this paper we present and discuss ways of associating a constraint language with the SGML model, and we present the steps towards the implementation of that language.
Finally, we present a new SGML authoring and processing model which has an extra validation task: semantic validation. Throughout the paper we show some case studies whose quality could be improved with this new working scheme.
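To make the idea of data semantics validation concrete, here is a minimal sketch in Python. It is not the constraint language proposed in the paper: the `person`, `born` and `died` element names, the year range, and the helper functions are all hypothetical, and the document is written in XML-compatible markup so that the standard library parser can stand in for an SGML parser. It shows one domain range limitation and one information relationship being checked after the structure has already been parsed.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Constraint:
    """A named predicate over a parsed document (hypothetical design)."""
    description: str
    check: Callable[[ET.Element], bool]


def int_value(root: ET.Element, tag: str) -> Optional[int]:
    """Return the integer content of a child element, or None if absent or malformed."""
    text = root.findtext(tag)
    try:
        return int(text.strip())
    except (AttributeError, ValueError):
        return None


def born_in_range(root: ET.Element) -> bool:
    # Domain range limitation: the year must fall inside an assumed range.
    born = int_value(root, "born")
    return born is not None and 1000 <= born <= 1999


def born_before_died(root: ET.Element) -> bool:
    # Information relationship: one element's value depends on another's.
    born, died = int_value(root, "born"), int_value(root, "died")
    return born is not None and died is not None and born <= died


CONSTRAINTS = [
    Constraint("born must be between 1000 and 1999", born_in_range),
    Constraint("born must not be later than died", born_before_died),
]


def validate_text(markup: str) -> List[str]:
    """Parse the markup, then return the descriptions of all violated constraints."""
    root = ET.fromstring(markup)
    return [c.description for c in CONSTRAINTS if not c.check(root)]


if __name__ == "__main__":
    record = "<person><born>1920</born><died>1890</died></person>"
    for problem in validate_text(record):
        print("semantic error:", problem)
```

The record in the usage example is structurally valid, so a purely structural validator would accept it; only the second, semantic pass reports that the birth year is later than the death year.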
Similar resources
Processing SGML Documents
SGML (Standard Generalized Markup Language) is an ISO Standard that specifies a language for document representation. The main idea behind SGML is to strictly separate the structure and contents of a document from the processing of that document. This results in application-independent and thus reusable documents. To gain the full benefit of this approach, tools are needed to support a wide ran...
Complementary Approaches to Representing Differences Between Structured Documents
Structured documents. Documents can be represented as structures with a hierarchical arrangement of text and non-text nodes, where nodes are labelled by category names such as “paragraph” and “section”. Representing documents this way is a natural consequence of using the Standard Generalized Markup Language (SGML) to encode the content and form of documents [10, 11, 7]. SGML is widely used. HTM...
Comparing of SGML documents
Documents can be represented as structures with a hierarchical arrangement of text and non-text nodes, where nodes are labeled by category names such as paragraph and section. Representing documents this way is a natural consequence of using the Standard Generalized Markup Language (SGML) to encode text documents, which has many applications in different areas. There are many circumstances in whic...
Structured storage and retrieval of SGML documents using Grove
SGML, standardized in ISO 8879 [International Organization for Standardization (1986)], has proliferated because it can provide various styles and transform documents on different platforms. An SGML document has logical structure information in addition to its contents. As SGML documents are widely used, there is an increasing demand for a storage and retrieval system to use the logical stru...
The Implementation of the Amsterdam SGML Parser
The Standard Generalized Markup Language (SGML) is an ISO Standard that specifies a language for document representation. This paper gives a short introduction to SGML and describes the Amsterdam SGML Parser and the problems we encountered in implementing the Standard. These problems include interpretation of the Standard in the places where it is ambiguous and the technical problems in parsin...
Journal title:
- Markup Languages
Volume 1, Issue
Pages -
Publication date 1999